![]() |
The Twofish Encryption Algorithm: A 128-Bit Block Cipher
by Bruce Schneier ; John Kelsey ; Doug Whiting ; David Wagner ; Chris Hall ; Niels Ferguson Wiley Computer Publishing, John Wiley & Sons, Inc. ISBN: 0471353817 Pub Date: 03/01/99 |
Previous | Table of Contents | Next |
By design, the general ordering of operations in Twofish alternates as follows: 8-by-8 S-box, MDS matrix, PHT with subkey addition, and XOR. The underlying algebraic operations thus alternate between non-linear table lookup, a GF(2)-linear combination of the bits by the MDS matrix, integer addition mod 232, and GF(2) addition (XOR). Within the S-boxes, several levels of alternating XOR and 8-by-8 permutations are applied. The goal of this ordering is to help destroy any hope of using a single algebraic structure as the basis of an attack. No two consecutive operations use the same structure, except for the PHT and the key addition that are designed to be merged for faster implementations.
There are two major mechanisms for providing diffusion in the round function. The first is the MDS matrix multiply, which ensures that each output byte depends on all input bytes. The two outputs of the g functions (T0 and T1) are then combined using a PHT so that both of them will affect both 32-bit Feistel XOR quantities. The half of the PHT involving the quantity T0 + 2T1 will lose the most significant bit of T1 due to the multiply by 2. This bit could be regained using some extra operations, but the software performance would be significantly decreased, with very little apparent cryptographic benefit. In general, the most significant byte of this PHT output will still have a non-zero output difference with probability 254/255 over all 1-byte input differences.
We cannot guarantee that a single byte input change to the F function will change seven or eight of the output bytes of F. The reason is that the carries in the addition of the PHT can remove certain byte differences. For example, an addition with a constant might turn a difference of 0000018016 into 0000008016. The chances of this happening depend on the distance between the two bits that influence each other. A large reduction in the number of changed bytes is very unlikely.
Within each round, both of the 32-bit words that are XORed with the round function results are also rotated by a single bit. One word is rotated before the XOR, and one after the XOR. This structure provides symmetry for decryption in the sense that the same software pipeline structure can be applied in either direction. By rotating a single bit per round, each 32-bit quantity is used as an input to the round function twice in each of the eight possible bit alignments within the byte.
These rotations in the Twofish round functions were included specifically to help break up the byte-aligned nature of the S-box and MDS matrix operations, which we feared might otherwise permit attacks using statistics based on strict byte alignment. For example, Square uses an 8-by-8-bit permutation and an MDS matrix in a fashion fairly similar to Twofish, but without any rotations. An early draft of the Square paper proposed a very simple and powerful attack, based solely on such byte statistics, that forced the authors to increase the number of rounds from six to eight. Also, an attack against SAFER is based on the ciphers reliance on a byte structure [Knu95c].
Choosing a rotation by an odd number of bits ensures that each of the four 32-bit words are used as input to the g function in each of the eight possible bit positions within a byte. Rotating by only one bit position helps optimize performance on the Pentium CPU (which, unlike the Pentium Pro, has a one-clock penalty for multi-bit rotations) and on smart card CPUs (which generally do not have multi-bit rotate opcodes for 32-bit words). Limiting rotations to single-bit also helps minimize hardware costs, since the wiring overhead of fixed multi-bit rotations is not negligible.
There are three downsides to the rotations in Twofish. First, there is a minor performance impact (less than 7 percent on the Pentium) in software due to the extra rotate opcodes. Second, the rotations make the cipher non-symmetric in that the encryption and decryption algorithms are slightly different, thus requiring distinct code for encryption and decryption. It is only the rotations that separate Twofish from having a reversible Feistel structure. Third, the rotates make it harder to analyze the cipher for security against differential and linear attacks. In particular, they make the simple technique of merely counting active S-boxes quite a bit more complicated. On the other hand, it is much harder for the attacker to analyze the cipher. For instance, it is much harder to find iterative characteristics, since the bits do not line up. The rotates also make it harder to use the same high-probability characteristic several times, because the bits get rotated out of place. On the whole, the advantages were considered to outweigh the disadvantages.
It is possible to convert Twofish to a pure Feistel structure by incorporating round-dependent rotation amounts in F, and adding some fixed rotations just before the output whitening. This might be a useful view of the cipher for analysis purposes, but we do not expect any implementation to use such a structure.
The rotations were also very carefully selected to work together with the PHT and the MDS matrix to preserve the MDS difference properties for single byte input differences to g. In particular, for both encryption and decryption, the one-bit right rotation occurs after the Feistel XOR with the PHT output. The MDS matrix was chosen to guarantee that a 32-bit word rotate right by one bit will preserve the fact that all four bytes of the 32-bit word are changed for all single input byte differences. Thus, placing the right rotation after the XOR preserves this property. However, during decryption, the rotate right is done after the XOR with the Feistel quantity involving T0 + 2T1. Note that, in this case, the rotate right puts the 2T1 quantity back on its byte boundary, except that the most significant bit has been lost. Therefore, given a single input byte difference that affects only T1, the least significant three bytes of the Feistel XOR output are guaranteed to change after the rotation, and the most significant byte will change with probability 254/255.
The fact that the rotate left by one bit occurs before the Feistel XORs during encryption guarantees that the same relative ordering (i.e., rotate right after XOR) occurs during decryption, preserving the difference-property for both directions. Also, performing one rotation before and one after the Feistel XOR imposes a symmetry between encryption and decryption that helps guarantee very similar software performance for both operations on the Pentium CPU family, which has only one ALU pipeline capable of performing rotations.
Sixteen rounds corresponds to eight cycles, which seems to be the norm for block ciphers. DES, IDEA, and Skipjack all have eight cycles. Twofish was defined to have eight cycles (16 rounds) primarily out of pessimism. Although our best non-related-key attack only breaks five rounds of the cipher, we cannot be sure that undiscovered cryptanalysis techniques do not exist that can do better. Hence, we consider 16 rounds to be a good balance between our natural skepticism and our desire to optimize performance. Even so, we took pains to ensure that the Twofish key schedule works with a variable number of rounds. It is easy to define Twofish variants with more or fewer rounds.
Previous | Table of Contents | Next |